DOCUMENT RESUME

ED 048 639                                   EA 003 349

AUTHOR       Mann, Stuart H.
TITLE        Least-Cost Decision Rules for Dynamic Information
             Management. Working Paper.
PUB DATE     [70]
NOTE         16p.; Paper presented at Operations Research
             Society of America National Meeting (38th, Detroit,
             Michigan, October 28-30, 1970)

EDRS PRICE   EDRS Price MF-$0.65 HC-$3.29
DESCRIPTORS  *Cost Effectiveness, Information Retrieval,
             Information Storage, *Information Utilization,
             *Mathematical Models, *Technical Reports




ABSTRACT
Least cost decision rules for transferring documents from primary to
secondary storage are developed from a dynamic programming model of an
information system. The program is constrained to provide for a minimum
acceptable level of user benefits. Knowledge of the physical size of the
primary storage area and the fraction of documents returned from
secondary to primary storage in each decision period is required.
Transfer, handling, and circulation costs are considered. Increase in
the total size of the document collection is assumed to be an
uncontrolled random process. (Author)






U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE 
OFFICE OF EDUCATION 






THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE 
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS 
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION 
POSITION OR POLICY. 



Least-Cost Decision Rules for Dynamic Information Management



Stuart H. Mann 



Division of Man-Environment Relations 
The Pennsylvania State University 
University Park, Pennsylvania 16802 



"PERMISSION TO REPRODUCE THIS COPYRIGHTED
MATERIAL HAS BEEN GRANTED

[signature]

TO ERIC AND ORGANIZATIONS OPERATING
UNDER AGREEMENTS WITH THE U.S. OFFICE
OF EDUCATION. FURTHER REPRODUCTION
OUTSIDE THE ERIC SYSTEM REQUIRES
PERMISSION OF THE COPYRIGHT OWNER."









This research has been supported partially under
National Science Foundation, Office of Science
Information Grant GN-759 as administered by the
School of Industrial Engineering, Purdue University.



This is a working paper. Comments are invited and 
should be directed to the author at the above address. 
Please do not reproduce or quote without the author's
permission.




Presented at the 38th National Meeting of the Operations Research
Society of America, Detroit, Michigan, October 28-30, 1970.






ABSTRACT 



Least cost decision rules for transferring documents
from primary to secondary storage are developed from a dynamic
programming model of an information system. The program is
constrained to provide for a minimum acceptable level of user
benefits. Knowledge of the physical size of the primary
storage area and the fraction of documents returned from
secondary to primary storage in each decision period is
required. Transfer, handling, and circulation costs are
considered. Increase in the total size of the document
collection is assumed to be an uncontrolled random process.



INTRODUCTION 

The information explosion of the last few years has resulted in 
considerable research effort being directed toward the purchase and 
accumulation decisions faced by information centers. Two recent papers, 
one by W. C. Lister and the other by H. M. Gurk and J. Minker, address
this concern.

In the Lister paper, the problem of least-cost decisions for trans- 
ferring information from primary storage areas to less accessible secondary 
storage areas is studied. He presents several models under varied assump- 
tions. However, the return of information from secondary to primary storage 
is not permitted in any of Lister's models. 

In contrast, Gurk and Minker investigate the size of primary storage
areas when information is allowed to return from secondary to primary
storage once certain given conditions are satisfied. However,
no attempt is made to identify best storage policies.

In this paper, we will incorporate the Lister and Gurk and Minker 
ideas and allow information to flow in both directions. In the following 
sections this main structure is expounded and exploited to identify optimal 
storage and transfer policies. 

PROBLEM STATEMENT 

Consider an information center with a fixed-size storage area for
fast-access document retrieval. We will call this area the primary storage
area. In a library, this area might be the stacks; in a computer facility,
it would be the disc or drum. A second storage area, called secondary
storage, is also available to the information center. It is less accessible
than primary storage and is assumed to have unlimited storage capacity.

Into the primary storage area of this information center new documents 
flow from an uncontrolled random process. This can be thought of as the 
blanket-order system for libraries wherein virtually all published books 
are received and processed for the collection held in primary storage. 

With a fixed size primary storage area and rapid increases in document 
input, an imbalance soon occurs unless space is made available. Space 
can be made available by transferring documents from primary storage to 
secondary storage. 

If we allow documents from secondary storage to be returned to primary
storage when they meet set decision criteria, then there is another
input source for primary storage. This input compounds the problem of an
already overcrowded primary storage area.



Now, given that we must not allow the collection in primary storage 
to drop below some fixed critical level and that a known number or expected 
number of documents from secondary storage are returned each decision 
period, the decision that must be made is how many documents do we transfer 
from primary to secondary storage in each decision period. It is assumed 
that decisions are made at the beginning of time periods of equal length, 
perhaps monthly. 

The forces inhibiting arbitrary transfer of documents from one storage
area to the other are the inherent costs involved. We will consider four
major costs. Two are incurred in the circulation and handling of
documents: one is the charge to the system for the circulation and
handling of documents in the primary storage area, expressed as a function
of the number of documents contained therein; the other is the
corresponding charge for the circulation and handling of documents in the
secondary storage area. The remaining two costs are realized upon the
transfer of documents: one charge is made for the transfer of documents
from primary to secondary storage, and the other for the transfer of
documents from secondary to primary storage. Both of these costs are
expressed as functions of the number of documents transferred.

The objective can now be stated as follows: find the number of documents to
transfer to secondary storage each decision period so as to minimize the
total of circulation, handling, and transfer costs in maintaining the
primary and secondary collections, given that the size of the primary
collection must be no less than some minimum acceptable level. In the
following section we will develop the mathematical model for the described
system and describe the form of the optimal policy under given conditions.

MATHEMATICAL MODEL

The following necessary assumptions are made: (a) the fraction of
secondary storage documents that are transferred to primary storage in
each period is a fixed known value; (b) documents to be transferred from
primary to secondary storage will be moved on the basis of age, the oldest
moving first; (c) documents moved from secondary to primary storage are
considered as new documents in primary storage.

The following parameters, variables, and functions are identified
for subsequent use:

$P$ - maximum workable size of primary storage area;

$\beta$ - fraction of primary collection that must be maintained
to insure minimum level of user acceptability ($0 < \beta < 1$);

$\eta$ - fraction of secondary collection that is transferred to
primary storage in each decision period ($0 < \eta < 1$);

$\xi$ - random variable for the number of new documents as input
to the primary collection from an external source;

$\phi(\xi)$ - probability mass function for the random variable $\xi$,
$\xi = 0, 1, 2, \ldots$;

$x$ - size of the primary collection at the beginning of a
decision period, just before a transfer decision is made;

$\omega$ - size of the secondary collection at the beginning of a
decision period, just before a transfer decision is made;

$y_j$ - number of documents transferred from primary to secondary
storage at the beginning of decision period $j$, $j = 1, 2, \ldots, n$.

It may be desirable to include an external provision that would maintain
a document in primary storage if it had experienced considerable use, even
if it were eligible for transfer to secondary storage.

The costs imposed on the system are as follows:

$p$ - circulation and handling charge per document in primary
storage;

$s$ - circulation and handling charge per document in secondary
storage;

$t_1$ - cost per document transferred from primary to secondary
storage;

$t_2$ - cost per document transferred from secondary to primary
storage, with all $p$, $s$, $t_1$, and $t_2 > 0$.

Since decisions are to be made at the beginning of each of $n$ equal-length
decision periods, it is convenient to model the process as a dynamic
program. We will number the decision periods backward in the usual way.
Let $f_j(x, \omega)$ be defined as the minimum total expected cost with $j$
decision periods remaining, starting with $x$ documents in primary storage
and $\omega$ documents in secondary storage. Define $f_0(x, \omega)$ to be
zero for all $x$ and $\omega$.

The time sequence of events is as follows: the decision period begins
with the state of the system observed as the document levels in primary and
secondary storage; decisions are made simultaneously for the transfer of
documents; documents are transferred; costs are charged on the new document
levels and the amounts transferred; random input enters primary storage;
the decision period ends. For a single decision period we have



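\[
f_1(x, \omega) = \min_{y_1} \left\{ p(x + \eta\omega - y_1) + s(\omega + y_1 - \eta\omega) + t_1 y_1 + t_2 \eta\omega \right\} \tag{1}
\]

where the decision variable $y_1$ must satisfy

\[
\beta P \le x + \eta\omega + E(\xi) - y_1 \le P \tag{2}
\]

and

\[
y_1 \ge 0. \tag{3}
\]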



Since the decision for the transfer of documents to secondary storage 
must take into account the maximum size of the primary storage area, the 
knowledge of what is to be received by primary storage during the decision 
period is incorporated into constraint (2) . This constraint forces the 
transfer of documents to be large enough to enable the size of the primary 
collection not to exceed its upper bound at any time during the decision 
period. At the same time constraint (2) requires that a minimum size 
primary collection be maintained. Constraint (3) states that negative 
amounts cannot be transferred. 
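As a minimal sketch of equations (1) through (3), the one-period cost and the range of admissible transfers might be computed as follows. The function and argument names are our own illustrative choices, not notation from the model; `beta`, `eta`, and `exp_xi` stand for the minimum-level fraction, the return fraction, and the expected number of new documents.

```python
def one_period_cost(x, w, y, p, s, t1, t2, eta):
    """Cost (1): charges on post-transfer levels plus both transfer charges."""
    returned = eta * w                 # documents moved secondary -> primary
    primary = x + returned - y         # primary level after both transfers
    secondary = w + y - returned       # secondary level after both transfers
    return p * primary + s * secondary + t1 * y + t2 * returned


def feasible_transfer_range(x, w, P, beta, eta, exp_xi):
    """Interval of y allowed by constraints (2) and (3), or None if empty."""
    expected_level = x + eta * w + exp_xi   # expected primary size if y = 0
    low = max(0.0, expected_level - P)      # smallest y keeping the level <= P
    high = expected_level - beta * P        # largest y keeping the level >= beta * P
    return (low, high) if low <= high else None
```

Note that the bound in (2) is written in terms of the expected input, so the sketch guarantees the limits only for the expected, not the realized, number of arrivals.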



Now, if the objective function (1) is rewritten as

\[
f_1(x, \omega) = \min_{y_1} \left\{ (s - p + t_1) y_1 + p x + [(p + t_2)\eta + s(1 - \eta)]\omega \right\} \tag{4}
\]



subject to (2) and (3), it is obvious that the optimal policy for this
single decision period depends only on the coefficients $p$, $s$, and $t_1$.

There are three cases to consider. Case (1): $s > p$; since $t_1 > 0$ and
$s > p$, $s - p + t_1 > 0$, and this positive coefficient implies that $y_1$
should be made as small as possible, i.e.,
$\max[0,\ x + \eta\omega + E(\xi) - P]$. Case (2): $p > s$, $s + t_1 > p$;
again this implies $s - p + t_1 > 0$, which yields the same optimal policy
as case (1). Case (3): $p > s + t_1$; this implies $s - p + t_1 < 0$, which
indicates that $y_1$ should be made as large as possible, i.e., optimal
$y_1 = x + \eta\omega + E(\xi) - \beta P$.
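The three cases reduce to checking the sign of the coefficient of the transfer variable in (4). A small sketch under that reading follows; the function name is hypothetical, and the treatment of a zero coefficient (where every feasible transfer costs the same) is our own assumption, resolved here in favor of the smaller transfer.

```python
def single_period_transfer(x, w, P, beta, eta, exp_xi, p, s, t1):
    """Optimal y_1 for objective (4) subject to constraints (2) and (3)."""
    expected_level = x + eta * w + exp_xi
    low = max(0.0, expected_level - P)      # least transfer that respects P
    high = expected_level - beta * P        # most transfer that respects beta * P
    if low > high:
        raise ValueError("constraints (2) and (3) cannot both be met")
    coefficient = s - p + t1
    # Positive coefficient: cost rises with y, so transfer as little as allowed.
    # Negative coefficient: cost falls with y, so transfer as much as allowed.
    return low if coefficient >= 0 else high
```

For instance, with 100 documents in primary storage, 40 in secondary storage, a return fraction of 0.25, and 8 expected arrivals against a capacity of 110, cases (1) and (2) transfer only the 8 documents of expected overflow.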

For a decision process of $n$ periods' duration, we have the following
objective function:

\[
f_n(x, \omega) = \min_{y_n} \left\{ (s - p + t_1) y_n + p x + [(p + t_2)\eta + s(1 - \eta)]\omega
+ \sum_{\xi} f_{n-1}(x + \eta\omega + \xi - y_n,\ \omega + y_n - \eta\omega)\,\phi(\xi) \right\} \tag{5}
\]

subject to

\[
\beta P \le x + \eta\omega + E(\xi) - y_n \le P
\]

and $y_n \ge 0$.
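Recursion (5) can be evaluated directly for small problems. The sketch below is a brute-force illustration, not the method of this paper: it searches an evenly spaced grid of transfers between the two bounds of the constraint, and the names `expected_cost_to_go`, `prm`, and `grid_points` are hypothetical. The probability mass function is assumed to be a dictionary from arrival counts to probabilities, and `grid_points` is assumed to be at least 2.

```python
def expected_cost_to_go(j, x, w, prm, pmf, grid_points=5):
    """Recursion (5) by brute force: returns (f_j(x, w), best y_j on the grid)."""
    if j == 0:
        return 0.0, None                    # f_0 is zero for every state
    p, s, t1, t2 = prm["p"], prm["s"], prm["t1"], prm["t2"]
    eta, beta, P = prm["eta"], prm["beta"], prm["P"]
    exp_xi = sum(k * prob for k, prob in pmf.items())
    level = x + eta * w + exp_xi            # expected primary size if y = 0
    low, high = max(0.0, level - P), level - beta * P
    if low > high:
        raise ValueError("no feasible transfer in this state")
    best = (float("inf"), None)
    for i in range(grid_points):
        y = low + (high - low) * i / (grid_points - 1)
        # Stage cost in the rearranged form of the objective.
        stage = (s - p + t1) * y + p * x + ((p + t2) * eta + s * (1 - eta)) * w
        # Expected cost of the remaining periods over the arrival distribution.
        future = sum(
            prob * expected_cost_to_go(j - 1, x + eta * w + k - y,
                                       w + y - eta * w, prm, pmf, grid_points)[0]
            for k, prob in pmf.items()
        )
        if stage + future < best[0]:
            best = (stage + future, y)
    return best
```

The work grows exponentially with the number of periods, so this form is useful only for checking small cases against the closed-form rules derived in the remainder of the paper.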


Lemma 1.

The function $f_n(x, \omega)$ is linear in $x$ and $\omega$ for all $n$.

Proof.

We have assumed $f_0(x, \omega) = 0$ for all $x$ and $\omega$. Then

\[
f_1(x, \omega) = \min_{y_1 \in Y_1} \left\{ (s - p + t_1) y_1 + p x + [(p + t_2)\eta + s(1 - \eta)]\omega \right\}
\]

where

\[
Y_j = \{\, y_j : \beta P \le x + \eta\omega + E(\xi) - y_j \le P,\ y_j \ge 0 \,\}, \quad j = 1, 2, \ldots, n.
\]

Since the quantity in brackets {} is linear in $y_1$, the optimal $y_1$ is
$y_1^* = \max[0,\ x + \eta\omega + E(\xi) - P]$ or
$y_1^* = x + \eta\omega + E(\xi) - \beta P$.

In the case that $y_1^* = 0$,
$f_1(x, \omega) = p x + [(p + t_2)\eta + s(1 - \eta)]\omega$, which can be
written as $K_1 + L_1 x + M_1 \omega$ where $K_1 = 0$, $L_1 = p$, and
$M_1 = (p + t_2)\eta + s(1 - \eta)$.

In the case that $y_1^* = x + \eta\omega + E(\xi) - P$,
$f_1(x, \omega) = (s - p + t_1)[x + \eta\omega + E(\xi) - P] + p x + [(p + t_2)\eta + s(1 - \eta)]\omega$,
which can be written as $K_1 + L_1 x + M_1 \omega$ where
$K_1 = (s - p + t_1)[E(\xi) - P]$, $L_1 = s + t_1$, and
$M_1 = (t_1 + t_2)\eta + s$.

When $y_1^* = x + \eta\omega + E(\xi) - \beta P$,
$f_1(x, \omega) = (s - p + t_1)[x + \eta\omega + E(\xi) - \beta P] + p x + [(p + t_2)\eta + s(1 - \eta)]\omega$,
which can be written as $K_1 + L_1 x + M_1 \omega$ where
$K_1 = (s - p + t_1)[E(\xi) - \beta P]$, $L_1 = s + t_1$, and
$M_1 = (t_1 + t_2)\eta + s$.

In each case $f_1(x, \omega)$ is linear in $x$ and $\omega$. As an
induction assumption, assume
$f_{n-1}(x, \omega) = K_{n-1} + L_{n-1} x + M_{n-1} \omega$, where $K_{n-1}$,
$L_{n-1}$, and $M_{n-1}$ are functions of $s$, $p$, $t_1$, $t_2$, $\eta$,
$\beta$, $P$, and $E(\xi)$ only. Since
$x_{n-1} = x_n + \eta\omega_n + \xi - y_n$ and
$\omega_{n-1} = y_n + (1 - \eta)\omega_n$,

\[
f_n(x, \omega) = \min_{y_n \in Y_n} \left\{ (s - p + t_1) y_n + p x + [(p + t_2)\eta + s(1 - \eta)]\omega
+ \sum_{\xi} \left[ K_{n-1} + L_{n-1}(x + \eta\omega + \xi - y_n) + M_{n-1}(y_n + (1 - \eta)\omega) \right] \phi(\xi) \right\}
\]

\[
= \min_{y_n \in Y_n} \left\{ (s - p + t_1 + M_{n-1} - L_{n-1}) y_n + (p + L_{n-1}) x
+ [(p + t_2)\eta + s(1 - \eta) + \eta L_{n-1} + (1 - \eta) M_{n-1}]\omega + K_{n-1} + E(\xi) L_{n-1} \right\}.
\]

Since the quantity inside the brackets {} is linear in $y_n$, $y_n^*$ must
equal one of the endpoints, i.e., $x + \eta\omega + E(\xi) - \beta P$ or
$\max[0,\ x + \eta\omega + E(\xi) - P]$. Thus, $f_n(x, \omega)$ equals

(1): $(s - p + t_1)[x + \eta\omega + E(\xi) - \beta P] + p x + [(p + t_2)\eta + s(1 - \eta)]\omega + K_{n-1} + \beta P L_{n-1} + [x + \omega + E(\xi) - \beta P] M_{n-1}$, or

(2): $p x + [(p + t_2)\eta + s(1 - \eta)]\omega + K_{n-1} + [x + \eta\omega + E(\xi)] L_{n-1} + (1 - \eta)\omega M_{n-1}$, or

(3): $(s - p + t_1)[x + \eta\omega + E(\xi) - P] + p x + [(p + t_2)\eta + s(1 - \eta)]\omega + K_{n-1} + P L_{n-1} + [x + \omega + E(\xi) - P] M_{n-1}$.

In each of the three cases it is observed that $f_n(x, \omega)$ is linear in
$x$ and $\omega$ and can be written as $K_n + L_n x + M_n \omega$.

In case (1): $K_n = (s - p + t_1 + M_{n-1})[E(\xi) - \beta P] + \beta P L_{n-1} + K_{n-1}$,
$L_n = s + t_1 + M_{n-1}$, and $M_n = (t_1 + t_2)\eta + s + M_{n-1}$.

In case (2): $K_n = E(\xi) L_{n-1} + K_{n-1}$, $L_n = p + L_{n-1}$, and
$M_n = (p + t_2)\eta + s(1 - \eta) + \eta L_{n-1} + (1 - \eta) M_{n-1}$.

In case (3): $K_n = (s - p + t_1 + M_{n-1})[E(\xi) - P] + P L_{n-1} + K_{n-1}$,
$L_n = s + t_1 + M_{n-1}$, and $M_n = (t_1 + t_2)\eta + s + M_{n-1}$.
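The coefficient updates at the end of the proof of Lemma 1 can be collected into one step function. This is a sketch only: the case labels `"to_floor"`, `"none"`, and `"to_cap"` are our own names for cases (1), (2), and (3), and the argument names are illustrative.

```python
def next_coefficients(K, L, M, case, p, s, t1, t2, eta, beta, P, exp_xi):
    """One step of the Lemma 1 recursion: (K, L, M) at n-1 -> (K, L, M) at n."""
    if case == "none":                       # case (2): y* = 0
        return (exp_xi * L + K,
                p + L,
                (p + t2) * eta + s * (1 - eta) + eta * L + (1 - eta) * M)
    if case == "to_floor":                   # case (1): primary reduced to beta * P
        bound = beta * P
    elif case == "to_cap":                   # case (3): primary reduced to P
        bound = P
    else:
        raise ValueError("case must be 'none', 'to_floor', or 'to_cap'")
    return ((s - p + t1 + M) * (exp_xi - bound) + bound * L + K,
            s + t1 + M,
            (t1 + t2) * eta + s + M)
```

Starting from `K = L = M = 0`, which corresponds to the zero terminal cost, one step reproduces the single-period coefficients, and the cost for a given state is then `K + L * x + M * w`.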

Lemma 2.

(a) When $s > p$, then $s - p + t_1 + M_n - L_n > 0$ for all $n$.

(b) When $p > s$, $s + t_1 > p$, and $t_1 < \eta(t_1 + t_2)$, then
$s - p + t_1 + M_n - L_n > 0$ for all $n$.

(c) When $p - s > \max[t_1,\ \eta(t_1 + t_2)]$, then
$s - p + t_1 + M_n - L_n < 0$ for all $n$.

Proof.

(a) The proof will be by induction. From the objective function (4)
it is seen that there are two cases to consider when $n = 1$.

Case (1): $y_1^* = 0$.

\[
s - p + t_1 + M_1 - L_1 = s - p + t_1 + (p + t_2)\eta + s(1 - \eta) - p
= (2 - \eta)(s - p) + t_1 + t_2\eta > 0.
\]

Case (2): $y_1^* = x + \eta\omega + E(\xi) - P$.

\[
s - p + t_1 + M_1 - L_1 = s - p + t_1 + (t_1 + t_2)\eta + s - s - t_1
= s - p + (t_1 + t_2)\eta > 0.
\]

As the induction assumption, assume that
$s - p + t_1 + M_{n-1} - L_{n-1} > 0$. To evaluate
$s - p + t_1 + M_n - L_n$, there are three cases to consider.

Case (1): $y_n^* = x + \eta\omega + E(\xi) - \beta P$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (t_1 + t_2)\eta + s + M_{n-1} - s - t_1 - M_{n-1}
= s - p + (t_1 + t_2)\eta > 0.
\]

Case (2): $y_n^* = 0$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (p + t_2)\eta + s(1 - \eta) + \eta L_{n-1} + (1 - \eta) M_{n-1} - p - L_{n-1}
\]
\[
= (s - p + t_1 + M_{n-1} - L_{n-1})(1 - \eta) + s - p + \eta(t_1 + t_2).
\]

Thus, by the induction assumption, $s - p + t_1 + M_n - L_n > 0$.

Case (3): $y_n^* = x + \eta\omega + E(\xi) - P$.

Same as Case (1).

Therefore $s - p + t_1 + M_n - L_n > 0$ for all $n$, when $s > p$.

(b) The proof will be by induction. From the objective function (4)
it is seen that when $n = 1$ there are two cases to consider.

Case (1): $y_1^* = 0$.

\[
s - p + t_1 + M_1 - L_1 = s - p + t_1 + (p + t_2)\eta + s(1 - \eta) - p
= (s - p + t_1) + (1 - \eta)(s - p) + t_2\eta
\]
\[
> (s - p + t_1) + (1 - \eta)(-t_1) + t_2\eta
= s - p + t_1 + \eta(t_1 + t_2) - t_1 > 0.
\]

Case (2): $y_1^* = x + \eta\omega + E(\xi) - P$.

\[
s - p + t_1 + M_1 - L_1 = s - p + t_1 + (t_1 + t_2)\eta + s - s - t_1
= s - p + t_1 + (t_1 + t_2)\eta - t_1 > 0.
\]

As the induction assumption, assume that
$s - p + t_1 + M_{n-1} - L_{n-1} > 0$. To evaluate
$s - p + t_1 + M_n - L_n$, there are three cases to consider.

Case (1): $y_n^* = x + \eta\omega + E(\xi) - \beta P$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (t_1 + t_2)\eta + s + M_{n-1} - s - t_1 - M_{n-1}
= s - p + t_1 + (t_1 + t_2)\eta - t_1 > 0.
\]

Case (2): $y_n^* = 0$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (p + t_2)\eta + s(1 - \eta) + \eta L_{n-1} + (1 - \eta) M_{n-1} - p - L_{n-1}
\]
\[
= (s - p + t_1 + M_{n-1} - L_{n-1})(1 - \eta) + s - p + t_1 + \eta(t_1 + t_2) - t_1 > 0.
\]

Case (3): $y_n^* = x + \eta\omega + E(\xi) - P$.

Same as Case (1).

Thus, $s - p + t_1 + M_n - L_n > 0$ for all $n$, when $p > s$,
$s + t_1 > p$, and $t_1 < \eta(t_1 + t_2)$.







(c) The proof will be by induction. From the objective function (4)
it is seen that when $n = 1$, $y_1^* = x + \eta\omega + E(\xi) - \beta P$.

\[
s - p + t_1 + M_1 - L_1 = s - p + t_1 + (t_1 + t_2)\eta + s - s - t_1
= s - p + \eta(t_1 + t_2) < 0.
\]

As the induction assumption, assume that
$s - p + t_1 + M_{n-1} - L_{n-1} < 0$. To evaluate
$s - p + t_1 + M_n - L_n$, there are three cases to consider.

Case (1): $y_n^* = x + \eta\omega + E(\xi) - \beta P$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (t_1 + t_2)\eta + s + M_{n-1} - s - t_1 - M_{n-1}
= s - p + \eta(t_1 + t_2) < 0.
\]

Case (2): $y_n^* = 0$.

\[
s - p + t_1 + M_n - L_n = s - p + t_1 + (p + t_2)\eta + s(1 - \eta) + \eta L_{n-1} + (1 - \eta) M_{n-1} - p - L_{n-1}
\]
\[
= (s - p + t_1 + M_{n-1} - L_{n-1})(1 - \eta) + s - p + \eta(t_1 + t_2) < 0.
\]

Case (3): $y_n^* = x + \eta\omega + E(\xi) - P$.

Same as in Case (1).

Therefore, $s - p + t_1 + M_n - L_n < 0$ for all $n$, when
$p - s > \max[t_1,\ \eta(t_1 + t_2)]$.
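Lemma 2 can be spot-checked numerically by iterating the Lemma 1 updates for the coefficients of the primary and secondary collection sizes over every sequence of cases. The sketch below is an illustration, not a proof; `coefficient_range` is a hypothetical name, and the two bound cases are merged because they update these two coefficients identically.

```python
from itertools import product


def coefficient_range(periods, p, s, t1, t2, eta):
    """Min and max of s - p + t1 + M_n - L_n over all case sequences."""
    values = []
    for cases in product(("none", "bound"), repeat=periods):
        L = M = 0.0                          # zero terminal cost: L_0 = M_0 = 0
        for case in cases:
            if case == "none":               # y* = 0
                L, M = (p + L,
                        (p + t2) * eta + s * (1 - eta) + eta * L + (1 - eta) * M)
            else:                            # y* at either bound of the constraint
                L, M = s + t1 + M, (t1 + t2) * eta + s + M
            values.append(s - p + t1 + M - L)
    return min(values), max(values)
```

Under the conditions of parts (a) and (b) the returned minimum should be positive, and under the condition of part (c) the returned maximum should be negative, for any parameter values that satisfy those conditions.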



Theorem

The optimal transfer policy in each period of an $n$-period process
takes the following form:

(a) When either $s > p$, or $p > s$, $s + t_1 > p$, and
$t_1 < \eta(t_1 + t_2)$: if

$x + \eta\omega + E(\xi) > P$, transfer $x + \eta\omega + E(\xi) - P$ to secondary storage;

$x + \eta\omega + E(\xi) \le P$, do nothing.

(b) When $p - s > \max[t_1,\ \eta(t_1 + t_2)]$, transfer
$x + \eta\omega + E(\xi) - \beta P$ to secondary storage.
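The rule stated in the theorem might be coded as below. This is a sketch with hypothetical names; it returns `None` when neither set of conditions holds, since the theorem's conditions are sufficient rather than exhaustive.

```python
def theorem_transfer(x, w, P, beta, eta, exp_xi, p, s, t1, t2):
    """Transfer amount under the theorem, or None if neither condition holds."""
    level = x + eta * w + exp_xi             # expected primary size with no transfer
    eta_t = eta * (t1 + t2)
    if s > p or (p > s and s + t1 > p and t1 < eta_t):
        return max(0.0, level - P)           # part (a): remove only the expected overflow
    if p - s > max(t1, eta_t):
        return level - beta * P              # part (b): reduce to the minimum level
    return None                              # conditions not covered by the theorem
```

With 100 documents in primary storage, 40 in secondary storage, a capacity of 110, a minimum fraction of 0.5, a return fraction of 0.25, and 8 expected arrivals, part (a) transfers 8 documents and part (b) transfers 63.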



Proof

(a) From Lemma 1 it is seen that $f_n(x, \omega)$ can be written as

\[
f_n(x, \omega) = \min_{y_n \in Y_n} \left\{ (s - p + t_1 + M_{n-1} - L_{n-1}) y_n + (p + L_{n-1}) x
+ [(p + t_2 + L_{n-1})\eta + (s + M_{n-1})(1 - \eta)]\omega + K_{n-1} + E(\xi) L_{n-1} \right\}.
\]

Since the quantity within the brackets {} is linear in $y_n$, the solution
must occur at an endpoint. The endpoint is dependent solely on whether
the coefficient of $y_n$ is positive or negative. By Lemma 2, parts (a) and
(b), the coefficient is always positive under either of the given conditions.
Thus, to minimize the quantity within the brackets {}, $y_n$ should be made
as small as possible. Therefore, $y_n^* = \max[0,\ x + \eta\omega + E(\xi) - P]$,
and the optimal policy results.

(b) The reasoning is identical to part (a) with the exception that
now the coefficient of $y_n$ is negative under the given conditions, as shown
in Lemma 2, part (c). Thus, $y_n$ should be made as large as possible.
Therefore, $y_n^* = x + \eta\omega + E(\xi) - \beta P$.

CONCLUSION 

Sufficient conditions for simple operating rules are given 
by the theorem. These conditions are dependent on the cost 
parameters of the system and the fraction of documents returning 
to primary storage from secondary storage in each decision period. 
In addition to these parameters, implementation would require 
knowledge of the expected value of the number of new arrivals to 
the system in each period of the process. 
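To illustrate how the rule of part (a) of the theorem would operate from period to period, the following sketch applies it to a given sequence of realized arrivals. The function name and the list-based interface are our own assumptions; in practice the expected number of arrivals would have to be estimated from the center's records.

```python
def simulate_policy_a(x, w, P, eta, arrivals, exp_xi):
    """Apply the part (a) rule period by period; return (y, x, w) after each."""
    history = []
    for xi in arrivals:                      # xi: realized new documents this period
        y = max(0.0, x + eta * w + exp_xi - P)   # decided before xi is known
        returned = eta * w                   # documents coming back from secondary
        x, w = x + returned - y + xi, w + y - returned
        history.append((y, x, w))
    return history
```

Because the transfer is sized from the expected input, a period in which arrivals exceed their expected value can leave the primary collection temporarily above its maximum workable size until the next decision is made.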